Use of Multifrequency Channel Decomposition for Speech Recognition
Authors
Abstract
In speech recognition, performance is usually degraded by the confusable set in the vocabulary. One possible way to improve discriminability is to decompose the speech signal into multiple frequency channels with different weights. In this paper, the speech signal is decomposed into multiple frequency channels using the wavelet transform and filter banks, respectively. The signal in each channel is then used to compute LPC-derived cepstral coefficients. For each channel, a Bayesian network is adopted to model the speech features. Finally, a channel-weighting method is used to emphasize the contributions of the different channels. For experimental evaluation, the English E-set and 200 city names spoken by 20 speakers were used. The experimental results show that the multifrequency channel decomposition approach achieves better performance than the conventional single-channel method, and that the wavelet-transform and filter-bank approaches perform comparably.
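The pipeline described above — decompose the signal into frequency channels, score each channel, and combine the scores with per-channel weights — can be sketched as follows. This is a minimal illustration, not the authors' implementation: an unnormalized Haar (averaging/differencing) filter pair stands in for the wavelet/filter-bank front end, a channel-energy score stands in for the Bayesian-network channel models with LPC-derived cepstra, and the weights are arbitrary example values rather than learned ones.

```python
def haar_decompose(signal):
    """Split a signal into a low-frequency (approximation) and a
    high-frequency (detail) channel with an unnormalized Haar pair."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def channel_score(channel):
    """Stand-in per-channel score (the paper uses per-channel models
    over LPC-derived cepstral coefficients); here, channel energy."""
    return sum(x * x for x in channel)

def weighted_score(signal, weights):
    """Combine per-channel scores with the given channel weights."""
    channels = haar_decompose(signal)
    return sum(w * channel_score(c) for w, c in zip(weights, channels))

# Toy signal scored with an (illustrative) heavier weight on the
# low-frequency channel.
sig = [1.0, 3.0, 2.0, 2.0, 5.0, 1.0, 0.0, 4.0]
print(weighted_score(sig, [0.7, 0.3]))
```

A real system would recurse the decomposition to obtain more than two channels and would choose the weights to emphasize the channels that best separate the confusable set.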
Similar Articles
Multifrequency channel decompositions of images and wavelet models
In this paper we review recent multichannel models developed in psychophysiology, computer vision, and image processing. In psychophysiology, multichannel models have been particularly successful in explaining some low-level processing in the visual cortex. The expansion of a function into several frequency channels provides a representation which is intermediate between a spatial and a Fourier...
Evaluation of model adaptation by HMM decomposition on telephone speech recognition
In this paper, we evaluate the performance of the previously proposed HMM decomposition method [1] for model adaptation on telephone speech recognition. The HMM decomposition method separates a composed HMM into a known phoneme HMM and an unknown noise-and-channel HMM by maximum likelihood (ML) estimation of the HMM parameters. A transfer function (telephone channel) HMM is estimated using adaptation...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in speech recognition systems both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...